We consider random walks that start and are absorbed on the leaves of random networks and 
study the length of such walks. For the networks we investigate, Erdős-Rényi random graphs and 
Barabási-Albert scale free networks, these walks are not transient and we consider various approaches 
to computing the probability of a given length walk. One approach is to label nodes according to 
both their total degree and the number of links connected to leaf nodes, and as a byproduct we 
compute the probability of a random node of a scale free network having such a label. 



I. INTRODUCTION 

Random walks have played a significant role in physics over the last century and more recently their properties on 
random networks have received attention as they can model transport properties of complex systems. In this work we 
consider random walks that start on leaf nodes of the random network and are absorbed whenever they reach another 
leaf node. For certain types of network transport this is a natural restriction. For example, while diffusion has been 
proposed to model communication traffic on the internet [1-3], the majority of such traffic is between hosts lying at 
the edge of the network and would be better modelled by the type of random walk we consider in this paper. 

Past studies of non-absorbing random walks on random networks 0HH] have often concentrated on the mean first 
passage or hitting times [9]. Typically, the mean first passage time scales with the size of the network and the walk 
is transient in the thermodynamic limit. In contrast, at least for the networks we have studied, we find that the 
probabilities of given path lengths between leaves are independent of the size of the network. Moreover, the walks 
we study are often short and techniques that rely on diffusion reaching equilibrium are not obviously adaptable since 
the largest eigenvalue does not have time to become dominant. Besides averaging over networks of a given class, we 
average over many walks in order to find representative results for a given type of random network. For a long running 
non-absorbing random walk, the equilibrium node occupation probability is proportional to the degree of that node 
[4]. Even after averaging over walks, this is not the case for walks between leaves as links from leaf nodes must 
be treated differently. One way to do this is to consider an approach based on labelling nodes according to both 
their total degree k and the number of links l connecting to leaf nodes. We propose a technique in which the average 
occupation probabilities are proportional to the number of non-leaf links besides adapting other methods that appear 
in the literature. 

The networks we have chosen to study are the traditional Erdős and Rényi (ER) [10] random graphs and the scale 
free models due to Barabási and Albert (BA) [11]. There is consensus that the scale free networks model real networks 
better than ER random graphs. On the other hand, random graphs are more analytically tractable, principally due 
to their lack of correlation between node degrees at each end of a link. Since it is essential that our networks have 
leaf nodes, we work in the sparse regime of the ER graphs and choose the simplest BA graph. 

Our main interest lies in the probability, $p_t$, for a walk of length t between leaves. For short walks that can be 
enumerated, this probability can sometimes be computed exactly and displays differences between even and odd length 
walks due to the possibility of being absorbed by the originating leaf node. We expend some effort on these short 
walks, but are also interested in longer walks. For ER networks, the probability of longer walks decays exponentially 
but the correlations and lack of homogeneity of BA networks destroy this simple behaviour. Despite the hierarchical 
structure of the internet, data for the fraction of packets arriving at an edge node after a certain number of hops 
displays an exponential decay over a range up to about 30 hops. The most successful techniques to estimate the rate 
of decay of the probability for long walks are based on a modified assumption of the equilibrium occupation of nodes 
labelled by (k,l), the degree and the number of links to leaf nodes. 

We commence with some simple and instructive network models to develop intuition for the process. The next 
section covers Erdos and Renyi random graphs and we provide both analytic and simulation results. We then move 
on to discuss the problem on a scale free network: the m = 1 BA model with leaves. The conclusion returns to discuss 
the motivating example of the internet and summarises our results. 
II. SIMPLE NETWORKS 



To commence, recall the random walk [12] on the half line in which the walker starts by taking a step to the right 
from the origin, and is absorbed if it returns there. At each subsequent (discrete) time step the probability of moving 
one step to the right is p and that for a step to the left is 1 - p. The recursion relation for the occupancy probability 
can be solved with the help of a Fourier transform, and a generating function approach yields an expression for the 
first passage probabilities [9]. Paths that return to the origin always take an even number of steps and these paths 
typically involve many backtracks or reversals. Counting these paths amounts to a tree enumeration problem which is 
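implicitly solved by the generating function computation. The first passage probabilities, $r_t$, can be expressed using 
a Catalan number as

$$ r_t = C_{t/2-1}\, p^{t/2-1} (1-p)^{t/2}, \qquad C_n = \frac{1}{n+1}\binom{2n}{n}, \qquad t \text{ even}. \qquad (1) $$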
The sum over probabilities for all length paths is related to the generating function for Catalan numbers. This involves 
a square root and by taking the appropriate sign we find that the overall probability of return is given by

$$ \sum_t r_t = \frac{1 - |1 - 2p|}{2p} = \min\left(1, \frac{1-p}{p}\right). \qquad (2) $$
So if the walk is biased to the right, the overall probability that it will ever return is less than 1 and this behaviour 
is said to be transient. Of course, transience can only occur in the limit of an infinitely large network. 

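Asymptotically, for long paths, the probability of a path of a given length decays exponentially with the exponent 
$\gamma = -\frac{1}{2}\log[4p(1-p)]$.

$$ \lim_{t\to\infty} r_t = \frac{1}{p\sqrt{2\pi}\, t^{3/2}}\, \left(4p(1-p)\right)^{t/2} = \frac{1}{p\sqrt{2\pi}\, t^{3/2}}\, e^{-\gamma t} \qquad (3) $$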

The case p = 1/2 is special and gives rise to power law decay. 

The case on a finite line with two boundaries or leaves is treated in [13], but this does not give rise to any new 
concepts. 

Now consider a Cayley tree. The degree of vertices is taken to be constant k and rather than follow the precise 
location of the random walk, we track the level within the tree. We assume a Cayley tree with a large number of 
levels, but start from the boundary level which is the only level containing leaves. The probability that the walk 
moves one level towards the root is p = 1/k while the probability it moves towards the boundary is 1 - p. In terms of 
levels, the rest of the analysis is identical to that of the 1D random walk just considered, though in this case the real 
tree has many possible leaves and walks that start and end on different nodes are accounted for. For integer values 
of k, this walk will never be transient. The exponent $\gamma = -\frac{1}{2}\log\left[\frac{4}{k}\left(1 - \frac{1}{k}\right)\right]$ does not depend on the size of the 
graph, provided it is large enough. Note that a non-absorbing random walk from the root of an infinite Cayley tree 
is transient with return probability 1/(k - 1) for k ≥ 3 [6]. 

The final simple network we consider is a modification of the random regular network to allow the presence of 
leaves. This network is constructed by first taking an ordinary random regular network in which all nodes have the 
same degree and links are connected along the lines of the configuration approach of Molloy and Reed [2]. Then, to 
each node, an additional l links are attached, each ending in a leaf. Each of the non-leaf nodes then has total degree k 
with l links to leaves and k - l links to other non-leaf nodes. From the regularity of the problem it is straightforward 
to see that the probability of a random walk starting and ending on a leaf and making t steps is

$$ p_t = \frac{l}{k}\left(\frac{k-l}{k}\right)^{t-2}, \qquad t \ge 2. \qquad (4) $$

Again the decay is exponential and the exponent $\gamma$ does not depend on the size of the graph. By summing these 
probabilities we find that the walk is never transient and that it has mean length 1 + k/l. 
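
As a quick check of the normalisation and the mean length quoted above, the sketch below assumes the walk-length distribution takes the geometric form $p_t = (l/k)\,((k-l)/k)^{t-2}$ for t ≥ 2: one forced step from the leaf, t - 2 steps among non-leaf nodes, and a final step onto a leaf. The function names and the truncation `t_max` are choices made for this illustration.

```python
def leaf_walk_probability(t, k, l):
    """Probability of a leaf-to-leaf walk of exactly t steps.

    k is the total degree of every non-leaf node, l of which lead to leaves.
    """
    if t < 2:
        return 0.0                       # a leaf is never adjacent to another leaf here
    return (l / k) * ((k - l) / k) ** (t - 2)

def mean_length(k, l, t_max=2000):
    """Mean walk length, truncating the sum at t_max steps."""
    return sum(t * leaf_walk_probability(t, k, l) for t in range(2, t_max + 1))
```

With k = 5 and l = 2 the probabilities sum to 1 and the mean length evaluates to 1 + k/l = 3.5.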

This study of simple networks suggests that the probability of longer walks decays exponentially for homogeneous 
networks and that the exponent is not dependent on the size of the network. Imagine that a walk has survived until 
step t, then the probability, $\beta$, that it will survive one more step is just that of avoiding a leaf node. 

$$ \beta = e^{-\gamma} = 1 - P(\text{transition to a leaf node}) \qquad (5) $$

The decay is exponential when the transition probability does not depend on the length of the walk, and this is the 
case for homogeneous networks. We will estimate the transition probability for ER and BA networks in the following 
sections. 
III. ERDŐS-RÉNYI RANDOM GRAPHS 



It is more awkward to analyse random walks on networks in which the degree distribution is not fixed since the 
walk learns information about the structure of the network instance as it proceeds and this memory should be taken 
into account for backtracks. The most tractable models to consider are the Erdős and Rényi [10] (ER) random graphs 
due to both the absence of correlations between node degrees and their locally tree-like structure. Indeed, since this 
is the natural model, random walks on ER graphs have been studied extensively; but only relatively recently has 
the hitting time, which scales with the size of the graph, been computed [8]. One of the difficulties with traditional 
random walks on ER graphs has been that the interesting issues of transience only occur if the walk takes place on 
the giant component which is only present above the percolation threshold at $\langle k\rangle = 1$. Early work [5] avoided this 
difficulty by working in the dense regime where, with high probability, all nodes belong to the giant component. A 
mean field like approach [7] allows accurate estimates of the mean first passage time, but is more suited to numerical 
than analytic expression. 

FIG. 1: An enumeration of the three paths that contribute to a length 4 random walk between leaf nodes. Open circles represent 
leaf nodes and filled circles are intermediate, non-leaf nodes. The contribution of each path is given in equation (8). 

In studying ER graphs we use the following notation. The probability of a node picked at random having degree k 
is Poisson distributed. 

$$ n_k = \frac{\langle k\rangle^k}{k!}\, e^{-\langle k\rangle} \qquad (6) $$

The probability that the node at the end of a randomly chosen link has degree k must include a factor for the number 
of links leading to that node. 

$$ \frac{k\, n_k}{\langle k\rangle} \qquad (7) $$

Short paths can be enumerated; for example, there are three paths of length t = 4, shown in figure 1. By combining 
the probability of reaching a node with given degree with the random walk factor for leaving it, these paths are seen 
to contribute respectively. 

The sums can be written in terms of exponential integrals and are used to check numerical simulations. This diagrammatic 
approach is restricted to very short paths as no simple recursion exists and the number of diagrams increases 
rapidly. 

FIG. 2: The probability of a random walk between leaves of length t on an ER graph. Shown for various values of $\langle k\rangle$, the 
mean degree. 

Numerical studies indicate that the walks between leaves that we consider in this paper have the same properties 
as on the simple models: they do not scale with the size of the system and their probability decays as shown in figure 
2. For these leaf random walks there is no need to restrict the starting leaf nodes to be part of the giant component 
and we include the contribution from walks on the finite clusters. Below the percolation transition, where only finite 
clusters exist, the decay suffers strong finite size effects and is not exponential. Above the transition, the decay is 
exponential and in figure 3 we show the exponent as a function of the mean degree parameter $\langle k\rangle$ of the model. These 
numerical results are obtained from following 10000 random walks on each of 100 graphs of size 8000. The number of 
random walks followed must be large both at small and large $\langle k\rangle$. At small $\langle k\rangle$ there are many possible leaf nodes to 
start from, though the walks tend to be short. At large $\langle k\rangle$ there are not so many leaf nodes, but the walks can take 
many different paths. These results do not change as the network size is increased, suggesting that transience does 
not occur. 
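
The simulation procedure described above can be sketched in a few lines of standard-library Python. This is only an illustration of the idea, not the code used for the figures: the graph is generated as G(n, p) with p = ⟨k⟩/(n - 1), walks start on a uniformly chosen leaf (in any component), and a walk stops as soon as it lands on a node of degree one. All function names are hypothetical.

```python
import random
from collections import Counter

def er_graph(n, mean_degree, rng):
    """Adjacency lists of a G(n, p) random graph with p = mean_degree / (n - 1)."""
    p = mean_degree / (n - 1)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def leaf_walk_length(adj, start, rng):
    """Number of steps a walk from leaf `start` takes before landing on any leaf."""
    node, steps = adj[start][0], 1       # the first step is forced
    while len(adj[node]) > 1:            # degree-one nodes absorb the walk
        node = rng.choice(adj[node])
        steps += 1
    return steps

def length_distribution(adj, walks, rng):
    """Empirical probability of each walk length over `walks` random walks."""
    leaves = [v for v, nb in enumerate(adj) if len(nb) == 1]
    counts = Counter(leaf_walk_length(adj, rng.choice(leaves), rng)
                     for _ in range(walks))
    return {t: c / walks for t, c in sorted(counts.items())}
```

The quadratic-time graph generator is adequate for small tests; for graphs of size 8000 averaged over 100 instances, a sparse generator would be a more practical choice.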
FIG. 3: The decay exponent for random walks between leaves on an ER graph shown for various values of the mean degree 
parameter. The dotted line shows the rough estimate of the exponent from equation (9). The continuous line is the result 
of a more sophisticated argument given at the end of section III A. 



The behaviour of the exponent at large $\langle k\rangle$ can be estimated following the argument in equation (5). For large $\langle k\rangle$ 
we ignore memory effects and estimate the probability that a walk will survive one more step as

$$ \beta = 1 - \frac{n_1}{\langle k\rangle} = 1 - e^{-\langle k\rangle}. \qquad (9) $$

The exponent is thus $\gamma = -\log(1 - e^{-\langle k\rangle})$, or approximately $e^{-\langle k\rangle}$ for large $\langle k\rangle$. This estimate is shown as a dotted 
line in figure 3 and indeed matches the simulations for large $\langle k\rangle$. In a later section we improve this estimate. 

We now proceed to discuss more sophisticated approximations, with the aim of improving the prediction of walk 
survival probabilities in the region where the exact enumeration is not feasible, but the exponential behaviour has 
either not taken over, or the estimate above is not accurate. 
A. Labelling according to (k, l) 



Since the identification of leaf nodes is crucial for the random walks we study in this paper, we first propose an 
approach based on labelling each node according to both its overall degree k and the number of links to leaf nodes l. 
For ER graphs it is straightforward to write the probability, $q_{kl}$, that a node chosen at random has label (k, l). 

$$ q_{kl} = n_k \binom{k}{l}\, e^{-l\langle k\rangle} \left(1 - e^{-\langle k\rangle}\right)^{k-l} \qquad (10) $$

The sum over leaf links of course obeys

$$ \sum_{l=0}^{k} q_{kl} = n_k. \qquad (11) $$

The expectation value for the number of links to leaves found on a randomly chosen node is: 

$$ \langle l\rangle = \sum_{k,l} l\, q_{kl} = \langle k\rangle\, e^{-\langle k\rangle}. \qquad (12) $$

In order to proceed, we first identify the probabilities with which a walk will find a (k, l) node. The probability, $v_{kl}$, 
that the link from a randomly chosen leaf node connects to a (k, l) node is: 

$$ v_{kl} = \frac{l\, q_{kl}}{\langle l\rangle} \qquad (13) $$

Similarly, the probability, $w_{kl}$, that a randomly chosen link from a non-leaf node connects to a (k, l) node is: 

$$ w_{kl} = \frac{(k-l)\, q_{kl}}{\langle k-l\rangle} = \frac{(k-l)\, q_{kl}}{\langle k\rangle \left(1 - e^{-\langle k\rangle}\right)} \qquad (14) $$
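
The label probabilities lend themselves to a direct numerical check. The sketch below assumes that the (k, l) label probability factorises into a Poisson degree distribution and a binomial split of the k links, with each neighbour being a leaf with probability $e^{-\langle k\rangle}$, and that the two transition probabilities are obtained by weighting with l and k - l respectively. The function names and the argument `c` (standing for ⟨k⟩) are introduced here for illustration only.

```python
from math import comb, exp, factorial

def n_k(k, c):
    """Poisson degree distribution with mean degree c."""
    return c ** k * exp(-c) / factorial(k)

def q_kl(k, l, c):
    """Probability that a random node has degree k and l links to leaves."""
    e = exp(-c)                          # chance that a given neighbour is a leaf
    return n_k(k, c) * comb(k, l) * e ** l * (1 - e) ** (k - l)

def v_kl(k, l, c):
    """Label of the node reached by the single link of a random leaf."""
    return l * q_kl(k, l, c) / (c * exp(-c))            # <l> = c * exp(-c)

def w_kl(k, l, c):
    """Label of the node reached along a random link leaving a non-leaf node."""
    return (k - l) * q_kl(k, l, c) / (c * (1 - exp(-c)))  # <k - l> = c * (1 - exp(-c))
```

Summing `v_kl` or `w_kl` over all labels should give 1, and summing `q_kl` over l should recover the Poisson weight `n_k`, which provides a simple consistency test of the normalisations.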



It is now possible to write down probabilities for given length paths using the simplest form of approximation in which 
there is no memory and backtracks are not accounted for. The separate contributions at each step of the walk from 
the transition probability to a given degree (k, l) node and from the random walk are clearly exposed. 

where the final line is valid for any t ≥ 3. The predictions for paths of length 1 and 2 match the exact enumeration 
results and that for length 3 differs from the exact result by a factor of $(1 - e^{-\langle k\rangle})$. However, the prediction contained 
in the last equation above for the exponent of the decay is less successful. In the limit of large $\langle k\rangle$ this prediction 
differs by a factor of 2 from equation (9), which suggested a value of $e^{-\langle k\rangle}$, and according to figure 3 the earlier argument 
was correct. Moreover the failure extends to transience as the sum of probabilities is 1/2 in the large $\langle k\rangle$ limit, which 
does not match numerical expectations. Given these inadequacies, we do not show a figure with the predictions from 
this approximation and conclude that while neglecting both memory effects and backtracks is possible for very short 
paths, it fails for longer paths even at large $\langle k\rangle$. 

We can reproduce the simple prediction given in equation (9) for the asymptotic decay exponent using the (k, l) 
formalism. Assuming that after some steps, the walk is on a node of unspecified degree, then according to equation 
(14) the probability that its next step takes it to a node with label (k, l) is $w_{kl}$. In this case the probability that the 
walk is not absorbed in the next step is. 
So this argument yields precisely the same exponent as the rough one. We can obtain a better estimate of the exponent 
by assuming that when averaged over many leaf walks, the occupation probabilities take values similar to those 
attained by a non-absorbing walk in equilibrium. For traditional walks the equilibrium occupation probabilities are 
proportional to the number of incoming links: $k\, n_k/\langle k\rangle$ [3]. For leaf walks the natural generalisation of this equilibrium 
occupation probability is proportional to the number of non-leaf connections. So the $w_{kl}$ not only represent transition 
probabilities, but when properly normalised can represent the occupation probabilities of (k, l) nodes for k ≥ 2. 
Numerical investigations that track which sites are visited suggest that such an assumption is valid in the large $\langle k\rangle$ 
limit. With the appropriate normalisation to ensure the correct sum of occupation probabilities over non-leaf nodes, 
the estimate of the occupation probability of a (k, l) node is

$$ \frac{w_{kl}}{1 - e^{-\langle k\rangle}} = \frac{(k-l)\, q_{kl}}{\langle k\rangle \left(1 - e^{-\langle k\rangle}\right)^2}. \qquad (17) $$

The probability of stepping from a (k, l) node to another non-leaf node is simply (k - l)/k, so overall the probability 
of surviving one step is

$$ \beta = \frac{1}{\langle k\rangle \left(1 - e^{-\langle k\rangle}\right)^2} \sum_{k \ge 2} \sum_{l} \frac{(k-l)^2}{k}\, q_{kl} = 1 - \frac{e^{-\langle k\rangle}\left(\langle k\rangle - 1 + e^{-\langle k\rangle}\right)}{\langle k\rangle \left(1 - e^{-\langle k\rangle}\right)}. \qquad (18) $$

In the limit of large $\langle k\rangle$ this reproduces the rough result for the exponent in equation (9), but as can be seen from 
the continuous line in figure 3 this prediction matches the simulation results slightly better. 
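
The two estimates of the one-step survival probability can be compared numerically. The sketch below assumes the rough estimate is $1 - e^{-\langle k\rangle}$ and the improved estimate has the closed form $1 - e^{-\langle k\rangle}(\langle k\rangle - 1 + e^{-\langle k\rangle})/(\langle k\rangle(1 - e^{-\langle k\rangle}))$; it also evaluates the improved estimate as an explicit double sum over labels (k, l) with k ≥ 2, weighting each label by $(k-l)^2/k$. Names and the cut-off `k_max` are choices made for this illustration.

```python
from math import comb, exp, factorial

def beta_rough(c):
    """Survival probability ignoring memory: 1 - exp(-c), with c the mean degree."""
    return 1 - exp(-c)

def beta_improved(c):
    """Closed form of the occupation-weighted survival probability."""
    e = exp(-c)
    return 1 - e * (c - 1 + e) / (c * (1 - e))

def beta_improved_sum(c, k_max=80):
    """Same quantity as a direct double sum over labels (k, l) with k >= 2."""
    e = exp(-c)
    total = 0.0
    for k in range(2, k_max + 1):
        poisson = c ** k * exp(-c) / factorial(k)
        for l in range(k + 1):
            q = poisson * comb(k, l) * e ** l * (1 - e) ** (k - l)
            total += (k - l) ** 2 / k * q
    return total / (c * (1 - e) ** 2)
```

Under these assumptions the closed form and the explicit sum agree to rounding error, and the improved survival probability is always slightly larger than the rough one, so the corresponding decay exponent is slightly smaller.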



B. Memoryless Approximation 

Since the paths are typically short compared to the size of the network, the spirit of this paper is to use methods 
that approximate the path enumeration to make it tractable. An interesting, though flawed, paper in this context is 
that of Masuda and Konno [5]. These authors write recursion formulae for the return probabilities of a non-absorbing 
walk to an arbitrary node using an approximation that takes memory into account within each backtrack, but regards 
each separate backtrack as a new path. The approximation is expected to be successful at large $\langle k\rangle$. Unfortunately, 
the paper [5] does not take into account the factor of the number of links that weights the probability of finding a 
certain degree node at the end of a randomly chosen link, and this compromises their results. Nonetheless the defect 
is simply remedied and their approach can be modified for our random walks between leaves. 
FIG. 4: Diagrammatic representation of the recursion used in the memoryless approximation. The top diagram is for the first 
passage probability without absorption, $r_n$, and the lower diagram is for the probability of being absorbed at some other leaf, 
$q_n$. The dashed lines represent these quantities and the solid lines are single steps between nodes. The open circles are leaf 
nodes and the solid circles are non-leaf nodes. The sum over partitions is represented with the dotted arc. 



By considering the top diagram shown in figure 4, the probability of first passage to a node after 2t steps without 
being absorbed, $r_t$, obeys the following recursion. 

$$ r_{t+1} = \sum_{a=1}^{t} m_{a+1} \sum_{\{t_i\}:\ \sum_i t_i = t}\ \prod_{i=1}^{a} r_{t_i} \qquad (19) $$



The sum is over all integer partitions of t including all possible orderings of those partitions. The significance of 
the partitions is that they represent the length of each backtrack shown in figure 4. In equation (19), this sum over 
partitions is decomposed into a sum over the number of lobes or backtracks in the diagram (corresponding to the 
order of the partition), and a sum over partitions into precisely this many lobes. The parameters $m_a$ are correctly 
given by. 

where the lower limit of the sum imposes the condition of no absorption and this constitutes the only difference 
in computing $r_t$ compared with [8]. The form (19) illustrates the nature of the approximation, as the memoryless nature consists 
in regarding each walk on a lobe as independent. 

A generating function approach is useful and we write for these first passage probabilities without absorption

$$ R(x) = \sum_{t=1}^{\infty} r_t\, x^t. \qquad (21) $$

Later, when paths that are not of even length are taken into account, we will need to consider $R(x^2)$. In terms of 
R(x) the recursion takes on a more convenient form than equation (19). 

$$ R(x) = x \sum_{a} m_{a+1} \left(R(x)\right)^a \qquad (22) $$
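
A functional equation of the form $R(x) = x \sum_a m_{a+1} R(x)^a$ can be solved coefficient by coefficient, because R(x) has no constant term and each pass of a fixed-point iteration therefore fixes one more power of x. The sketch below illustrates this with truncated power series; the list `m` of coefficients is treated as a given input (its definition is not reproduced here), and the function names are hypothetical.

```python
def multiply(a, b, t_max):
    """Product of two power series, truncated at order x**t_max."""
    out = [0.0] * (t_max + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(t_max + 1 - i):
                out[i + j] += ai * b[j]
    return out

def solve_first_passage(m, t_max):
    """Coefficients r_1..r_{t_max} of R(x) = x * sum_a m[a] * R(x)**a.

    m[a] multiplies R**a, so m[a] plays the role of m_{a+1} in the text.
    """
    r = [0.0] * (t_max + 1)              # r[0] = 0: R(x) has no constant term
    for _ in range(t_max):               # each pass fixes one more coefficient
        power = [1.0] + [0.0] * t_max    # R**0
        rhs = [0.0] * (t_max + 1)
        for coeff in m:
            for t in range(t_max + 1):
                rhs[t] += coeff * power[t]
            power = multiply(power, r, t_max)
        r = [0.0] + rhs[:t_max]          # multiplication by x shifts the series
    return r[1:]
```

Two simple cases serve as checks: with `m = [1, 0.5]` the equation is linear and the coefficients are powers of 0.5, while `m = [1, 0, 1]` gives R = x(1 + R²), whose odd coefficients are the Catalan numbers 1, 1, 2, 5.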
The probability that a walk lasts t steps before being absorbed by a different leaf node, $q_t$, also obeys a recursion 
relation shown diagrammatically in the lower part of figure 4. The recursion is slightly different for even and odd 
values of t, but can be combined and is conveniently expressed in terms of the generating function. 

$$ Q(x) = \sum \ldots \qquad (24) $$



The first coefficient is $q_1 = n_1/\langle k\rangle$, which for ER graphs is simply $e^{-\langle k\rangle}$. The parameters $s_a$ are defined as. 

In some cases, it could be argued that the lower limit of this sum ought to be increased, but to keep track of these 
cases would be against the spirit of the approximation. For ER graphs, the parameters $s_a$ and $m_a$ are related by. 



s a -i ~ rn a _i 



(26) 



While Masuda and Konno invert equation (22) to obtain explicit expressions for the first passage probabilities in 
terms of sums over partitions [15], we have found using the recursion relations themselves to be numerically convenient 
and speedy. Using algebraic manipulation software, the equation for the generating function (22) conveniently encodes 
all the separate equations for each coefficient and can be solved quickly and accurately. The solution for R(x) can then 
be used to provide values for the probabilities $q_t$ using the equation below. The combined probability of absorption 
after t steps is the sum of the contributions from $r_t$ and $q_t$, where the first term only contributes for even length paths. 

(28) 



The second line is specific to ER graphs as it uses the relationship between parameters in equation (26). At x = 1 
this equation simplifies and by using the explicit forms of $q_1 = e^{-\langle k\rangle}$ and $s_1 = 1 - e^{-\langle k\rangle}$ 
we find R(1) + Q(1) = 1. So at the level of this approximation, leaf walks are never transient on ER graphs. Since the 
approximation is expected to improve as $\langle k\rangle$ increases, and this is the regime in which transience is most likely to 
occur, the result provides strong evidence that transience does not occur. The result for the full generating function is 
helpful as it avoids the need to calculate the parameters $s_a$, which is otherwise the most time consuming part of the 
solution. The alternative approach of solving the generating function R(x) by iteration and then obtaining the 
coefficients from residues works for small t, but becomes slow and unstable at larger t. 
FIG. 5: The probability for random walks between leaves on an ER graph shown for $\langle k\rangle = 3$. Points with error bars are from 
simulation and the crosses are predictions from the memoryless approximation. 

Figure 5 shows the predictions of this memoryless approach against the results of simulations for a value of $\langle k\rangle = 3$. 
Although this value is not very large, agreement is reasonable. The first three points are identical to the exact 
prediction but later points tend to track the curve for odd length walks. At larger values of $\langle k\rangle$ the agreement is even 
better. At $\langle k\rangle = 1$, the predictions are successful for t < 10, but for longer paths, the theory fails to capture the 
observed behaviour. 

This approximation provides an improvement over the techniques used in the previous section as it contains information 
about backtracking paths. For smaller values of $\langle k\rangle$, where backtracks and memory effects become more 
important, the approximation fails. 



IV. SCALE FREE NETWORKS 



Scale free networks (BA) [11] created through a growth process involving preferential attachment provide a popular 
alternative model of random networks. We consider the simplest version of this system which according to the 
classification of [11] is the m = 1 model with preferential attachment and with an initial network consisting of a pair 
of nodes connected by a link. The growth process leads to a directed network with $\langle k\rangle = 2$ and with 2/3 of the nodes 
being leaves. The degree distribution is given by [17]. 

$$ n_k = \frac{4}{k(k+1)(k+2)} \qquad (29) $$

However, if we take into account the age of nodes, older nodes tend to have larger degree while most leaf nodes are 
young. In general, these networks suffer from strong finite size effects [18]. 
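
The growth process is simple to simulate. The sketch below grows an m = 1 preferential-attachment network from an initial linked pair, choosing the attachment target with probability proportional to degree by sampling uniformly from a list of link ends. It is an illustration under these assumptions rather than the code used for the results reported here, and the names `ba_tree` and `ba_degree_distribution` are hypothetical.

```python
import random
from collections import Counter

def ba_tree(n, rng):
    """Adjacency lists of an m = 1 preferential-attachment network on n nodes."""
    adj = [[1], [0]]                 # initial pair of linked nodes
    ends = [0, 1]                    # every link end; node v appears deg(v) times
    for new in range(2, n):
        target = rng.choice(ends)    # chosen with probability proportional to degree
        adj.append([target])
        adj[target].append(new)
        ends += [new, target]
    return adj

def ba_degree_distribution(k):
    """Asymptotic degree distribution n_k = 4 / (k (k+1) (k+2))."""
    return 4 / (k * (k + 1) * (k + 2))
```

For a network of a few tens of thousands of nodes the measured fraction of leaves should lie close to n_1 = 2/3, and the fraction of degree-two nodes close to n_2 = 1/6.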

Due to the way in which they are created, this kind of scale free network is not homogeneous, has strong correlations 
and in these respects is very different from the ER networks. The directed nature of the network is crucial in 
understanding the correlations. The fraction of nodes of degree k that attach to an ancestor node of degree l (the 
degree of the ancestor, l, must be greater than or equal to 2) is [17] 

$$ n_{kl} = \frac{4(l-1)}{k(k+1)(k+l)(k+l+1)(k+l+2)} + \frac{12(l-1)}{k(k+l-1)(k+l)(k+l+1)(k+l+2)}. \qquad (30) $$



FIG. 6: The probability for random walks between leaves on a BA graph, plotted against the length of walk (t). Points with error bars are from simulation based on 
20000 walks on each of 100 graphs of size 100000. The continuous line is a prediction based on labelling nodes and is explained 
in the text. 



Random walks on this network have no knowledge of the direction of the links, but a full analysis of the walk should 
take this into account. 

Numerical results are shown in figure 6. There is no evidence for transience as the longest walk observed is much 
smaller than the size of the network and shows no change as that size is varied. Exponential decay of the random walk 
probabilities is not apparent for this network. This is in contrast to ER networks above the percolation threshold. 
In figure 2 the curves for graphs with $\langle k\rangle$ below the threshold do not display exponential decay, but this should be 
attributed to a finite size effect from individual clusters. The BA network is simply connected and investigations of 
the dependence on network size allow us to rule out finite size as a cause of the curvature. Longer paths are more 
likely than would be expected from a fixed exponential decay. This is due to the correlations and non-homogeneous 
nature of the BA network. Krapivsky and Redner [17] compute the age dependence of the degree distribution and 
support the notion of a highly connected old core to the network with high degree nodes. Long paths access this old 
core and the random walks that access this region tend to stay there and are less likely to be absorbed. 



A. Labelling according to (k, l) 

The framework afforded by labelling nodes by (k, l) is helpful to make predictions for BA networks despite reservations 
about its accuracy in the context of ER networks. In this section we compute the probability that a randomly 
selected node on this BA network has degree k and l links to leaf nodes. In fact, to simplify notation, we compute 
$q_{kp}$ where p = k - l is the number of links to non-leaf nodes. By considering the growth process, we find a recursion 
relation for the expected number of labelled nodes which when expressed as probabilities becomes

$$ (2k - p + 2)\, q_{kp} = (k-1)\, q_{k-1,p} + (k-p+1)\, q_{k,p-1}. \qquad (31) $$

When combined with the initial conditions $q_{11} = n_1$ and $q_{k,k+1} = q_{k0} = 0$, this difference equation allows all higher terms 
to be computed. However, an explicit solution requires some work. 
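
The difference equation can be iterated exactly with rational arithmetic. The sketch below assumes the recursion $(2k - p + 2)\, q_{kp} = (k-1)\, q_{k-1,p} + (k-p+1)\, q_{k,p-1}$ with $q_{11} = n_1 = 2/3$ and vanishing boundary terms, and provides a helper for moments of the number of leaf links k - p. The function names are illustrative.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def q(k, p):
    """Probability that a random node has degree k and p links to non-leaf nodes."""
    if k == 1 and p == 1:
        return Fraction(2, 3)            # q_11 = n_1, the fraction of leaves
    if k < 1 or p < 1 or p > k:
        return Fraction(0)               # covers q_k0 = 0 and q_{k,k+1} = 0
    return ((k - 1) * q(k - 1, p) + (k - p + 1) * q(k, p - 1)) / (2 * k - p + 2)

def leaf_link_moment(k, power):
    """Sum over p of (k - p)**power * q(k, p) for fixed degree k."""
    return sum((k - p) ** power * q(k, p) for p in range(1, k + 1))
```

For example, `q(2, 1)` evaluates to 2/15 and `q(2, 2)` to 1/30, which together give n_2 = 1/6; summing over p for any fixed k recovers 4/(k(k+1)(k+2)).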

Based on explorations for small values of k,p we partially disentangle the two indices by writing. 

^ = P -^(2^-^W (32) 
where $P_n(k)$ and $Q_n(k)$ are polynomials in k of degree n. These obey the following difference equations. 

$$ (2k - p + 2)\, P_{p-1}(k) = (2k+1)\, P_{p-1}(k-1) + (k-p+1)\, P_{p-2}(k) \qquad (33) $$
$$ (2k - p + 2)\, Q_{p-2}(k) = 2k\, Q_{p-2}(k-1) + (k-p+1)\, Q_{p-3}(k) \qquad (34) $$

The first equation can be solved immediately to give. 

P P -m = (p 2 lV {k+P k r 2)l (* + 2(p - I) 2 ) (35) 
While the second equation has the more complicated solution. 

n m- 4 J< Y- fc/2 1 (k + P -2j + iy. 

Qp - 2[ ) (fc + 1) (2j - l)(2j - 3) (j - 1)1 (P - 2i)!(fc - i + 1)! 1 j 

The resulting $q_{kp}$ match numerical values obtained from generating many BA networks. They also obey the following. 

$$ \sum_{p=1}^{k} q_{kp} = n_k = \frac{4}{k(k+1)(k+2)} \qquad (37) $$

$$ \sum_{p=1}^{k} (k-p)\, q_{kp} = \frac{2(k-1)(k+6)}{k(k+1)(k+2)(k+3)} \qquad (38) $$

$$ \sum_{p=1}^{k} (k-p)^2\, q_{kp} = \frac{(k-1)(k^3 + 13k^2 + 42k - 48)}{k(k+1)(k+2)(k+3)(k+4)} \qquad (39) $$

On BA networks there is only one connected component and there are no walks of length one step. The probability 
of a randomly selected leaf node connecting to a (k, p) node is

$$ v_{kp} = \frac{(k-p)\, q_{kp}}{\langle k-p\rangle} = \frac{3}{2}\,(k-p)\, q_{kp}. \qquad (40) $$

With this information we can now predict the probability of a walk of length 2 steps. 

$$ p_2 = \sum_{k=2}^{\infty} \sum_{p=1}^{k} v_{kp}\, \frac{k-p}{k} $$
$$ \phantom{p_2} = \frac{3}{2} \sum_{k=2}^{\infty} \sum_{p=1}^{k} \frac{(k-p)^2}{k}\, q_{kp} \qquad (41) $$
$$ \phantom{p_2} = \frac{3}{2}\, \frac{1}{144} \left(48\pi^2 - 419\right) = 0.570219\ldots $$

The first line exposes the separate factors for the transition and for the random walk, while the second line shows that 
this can be expressed in terms of the sums computed above in equation (39). Numerical simulations on 100 graphs of 
size $10^5$ observe 0.57 ± 0.02, fitting the prediction well. 
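
The numerical value of the length-2 probability can be checked by summing the series directly. The sketch below assumes that the sum over p of $(k-p)^2 q_{kp}$ has the closed form $(k-1)(k^3 + 13k^2 + 42k - 48)/(k(k+1)(k+2)(k+3)(k+4))$ and that the prefactor is 3/2; the partial sum is then compared with $(48\pi^2 - 419)/96$. Function and constant names are chosen for this illustration.

```python
from math import pi

def second_moment(k):
    """Closed form assumed for the sum over p of (k - p)**2 * q_kp."""
    return (k - 1) * (k**3 + 13 * k**2 + 42 * k - 48) / (
        k * (k + 1) * (k + 2) * (k + 3) * (k + 4))

def p2_partial(k_max):
    """Partial sum of (3/2) * sum_k second_moment(k) / k for 2 <= k <= k_max."""
    return 1.5 * sum(second_moment(k) / k for k in range(2, k_max + 1))

# (3/2) * (1/144) * (48 pi^2 - 419), written with a single denominator
P2_CLOSED = (48 * pi**2 - 419) / 96
```

The terms of the series fall off as 1/k², so the partial sum converges slowly, with a remainder of order 1/k_max; a cut-off of a few hundred thousand terms is enough to reproduce the first four decimal places.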

We can proceed to estimate the probability of walks of length 3 and over by extending this argument. However, 
this requires knowledge of the correlations between the (k,p) values of linked nodes. We have not computed this four 
index quantity, but have made some estimates using the degree correlations in equation (30). These estimates are not 
accurate and we do not present them here. 

Even though figure 6 clearly shows a curve, we can use the (k, l) formalism of section III A to estimate the survival probability. We expect the 
analogue of $w_{kl}$, namely $p\, q_{kp}/\langle p\rangle$, to represent the transition probability to a (k, l) site. Here the mean number of 
connections to non-leaf nodes is given by $\langle p\rangle = 4/3$, so the probability of making a further step without absorption is

$$ \beta = \sum_{k=2}^{\infty} \sum_{p=1}^{k} \frac{p\, q_{kp}}{\langle p\rangle} = 1 - \frac{q_{11}}{\langle p\rangle} = \frac{1}{2}. \qquad (42) $$

The decay exponent that this argument suggests is therefore a constant $\gamma = \log 2$. We do not show this prediction in 
figure 6 as it is not very accurate and we can obtain a better prediction following the reasoning used for ER networks 
and leading to the estimate of the survival probability given in equation (18). To do this we assume that $p\, q_{kp}/\langle p\rangle$ 
can be regarded as an occupation probability for non-leaf nodes. A factor of 2 is needed for correct normalisation and the 
probability of survival on the next step is

$$ \beta = 2 \sum_{k=2}^{\infty} \sum_{p=1}^{k} \frac{p}{k}\, \frac{p\, q_{kp}}{\langle p\rangle} = \frac{3}{2}\, \frac{1}{144} \left(48\pi^2 - 419\right) = 0.570219\ldots \qquad (43) $$



We have not discovered any reason for the appearance of exactly the same factor as in equation (41), and it seems to 
be a coincidence. 



If we assume that the argument holds for all length paths, and taking into account the lack of transience to 
normalise, we can predict the following form. 

= il ^P t ( 44 ) 

This form is shown in figure 6 and provides a surprisingly good fit for shorter paths. As explained above, longer 
paths are affected by the correlations, enter the region of old nodes, and are more likely than this approximation 
predicts. 



B. Other approaches 



We have not attempted to extend the memoryless approximation along the lines of the one described in section 
III B for ER networks as the level of complication appears to be unwarranted. Such an approximation would have to 
take into account the correlations between node degrees and to do this needs to delve into the underlying directed 
nature of the BA graph. The return probabilities would depend on the degree of the originating node. 



V. CONCLUSION 



This study of the problem of random walks between leaf nodes of random networks was initiated by the desire to 
model internet traffic, but it turns out that these walks probe network structure in a way that is not possible for 
traditional non-absorbing walks. The mean first passage time is a natural observable for non-absorbing walks, but 
to characterise classes of graph it must be averaged over all pairs of nodes and this reduces the information it can 
provide about the non-homogeneity of a network. By contrast, the random walks between leaves studied here provide 
a natural subset of nodes over which the mean first passage time can be averaged and this can give information 
about the heterogeneity of the network and the extent to which edge nodes are connected to the rest of the network. 
The clearest example of the way that walks between leaves can probe network structure was the distinction between 
exponential decay of the probability distribution of the length of the walk on homogeneous ER networks and the 
non-exponential decay for heterogeneous BA networks. This, however, leads to a conundrum: the internet is hierarchical, 
yet the probability distribution of time-to-live (TTL) values captured on the internet does not display any systematic 
deviation from exponential. This may indicate limits to the utility of random walks representing internet traffic. 

Our analytic attempts to predict the probability of given length walks have relied on a variety of techniques. For 
short paths we have been able to enumerate, or devise approximations that allow such enumerations. For longer paths 
we have had some success with ER networks by assuming that the equilibrium occupation probabilities of a node are 
proportional to the number of its non-leaf links. Together with results for the proportion of (k, I) nodes, this approach 
gives some useful predictions for BA networks but does not capture the non-trivial decay of the walk due to intrinsic 
correlations in the network. 

Acknowledgment: I am indebted to Dr B. Ghita for collecting the TTL data for the internet that was the original 
motivation for this work. 